Conversation

@hwsmm (Contributor) commented Sep 23, 2023

Please note that this is my first proper C# project. I got interested in #2784, and I started learning C# just because I wanted to try making it.
I ended up making it work somehow, but I'm not sure what I need to do to take it further.

Description

This PR separates AudioManager into BassAudioManager and AudioManager, and adds SDL3AudioManager and some supporting components.
Setting AudioDriver to SDL3 in framework.ini enables this feature. BASS remains the default otherwise.

Track/Sample instances now actually hold their raw audio data, so they are managed by C# rather than by BASS. Because of this, while it's technically an SDL audio implementation, SDL doesn't manage samples/tracks. Most logic, including mixing, is now in C# code, so it should be fairly easy to switch to another audio library, as long as it supports queueing audio directly like SDL does.

This mainly benefits Linux users as SDL supports PipeWire, PulseAudio, ALSA and JACK natively, whereas BASS only supports ALSA.

Choosing audio decoder

Either BASS or FFmpeg is usable. If you want to use FFmpeg, you need to provide proper FFmpeg binaries: osu!framework ships a cut-down FFmpeg intended only for video. libswresample is also needed.

You can enable FFmpeg by removing the BASS init lines in SDL3AudioDecoderManager. BASS is used as the default otherwise.

Used libraries

SoundTouch.NET is used to adjust track tempo. NAudio is used to apply BiQuad filters, adjust the frequency of tracks and samples, and perform FFT for Waveform/CurrentAmplitude. SDL is used to push audio to the actual audio server. (No native libraries added!)

They are all open-source, so it should be much easier to track down bugs.

Tested platforms

Tested on Windows 11, Linux and Android. Should also be usable on iOS and macOS, but I don't have an Apple device to test on.

Notable changes to existing components

  • VideoDecoder can now decode audio (I meant to separate this, but it would have made the diff too large)
  • Waveform no longer depends on BASS, but benchmark performance is about 1.3x worse (70ms versus 53ms with the previous implementation)

TODO

This PR requires no changes in osu! itself, but may require some in the future, as the game uses ManagedBass.Fx directly.

What's not working

  • Any audio effects other than BiQuad ones

Some quirks

  • Audio stutters a bit when GC is happening and the buffer is small

Needs to be done in future

  • Switch to the original SoundTouch instead of the .NET port for some optimisation: my old code used it, and it works as well as the .NET one, but I didn't want to add a native library in this PR.
  • Make VideoDecoder abstract, and separate audio part from it
  • BASS tests are reused to reduce diffs; they may need to be separated in the future.

@hwsmm force-pushed the sdl-audio branch 2 times, most recently from 9d150b0 to 445ec95 on September 23, 2023 18:21
@smoogipoo (Contributor)

This is super cool, though I don't see osu! using this in the near future, if ever. I'd see this implemented as an additive NuGet package, exposed through some way that isn't the framework config. For example, it could be included in HostOptions or simply as a virtual method in Game.

@peppy (Member) commented Sep 25, 2023

I'd actually like to experiment with this and see how good support is for the upcoming changes I want to make to WASAPI initialisation, so I don't want to throw this out. Moving away from BASS would be a huge consideration, but I wouldn't throw it away.

I'm going to mark this as a draft, as I don't see it getting reviewed or merged anytime soon, but it's still useful to have around as a reference for what is involved in making this work, and for potential performance / latency cross-checking in the future.

@peppy marked this pull request as draft on September 25, 2023 04:05
@hwsmm (Contributor, Author) commented Sep 29, 2023

I'll maintain this until you decide what to do with it, mostly because I am now so used to playing the game with these patches that I can't really go back to BASS...
FWIW, on Linux with PipeWire at a very small buffer, hitsound latency was almost on par with the patched Wine that most osu! Linux players use. It produces some artifacts, though, so it shouldn't be the default.

@hwsmm force-pushed the sdl-audio branch 5 times, most recently from fe6c986 to d19377f on October 3, 2023 07:30
@hwsmm force-pushed the sdl-audio branch 3 times, most recently from d781cbe to bb275f2 on October 16, 2023 15:02
@hwsmm (Contributor, Author) commented Oct 24, 2024

To be honest, I'm a bit unsure about this PR at this point.

It can bring some goodies to the framework, such as:

  • A free audio stack without introducing a new native library (though FFmpeg needs to support audio decoding)
  • All components except decode/output are managed, so it should be easy to debug
  • Better support for Linux audio servers, and potential support for niche architectures

However, from what I've seen over the past year, most people who want this merged want low-latency audio, which this PR fails to deliver due to GC stutters. There is no improvement on Windows (only good down to a 10ms buffer), and it's maybe even worse on Linux (good down to 20ms) if you want 'good' output.
Gameplay is fine, probably because of the LowLatency GC option, which we can't keep enabled for long.

So, as a solution for this, I have been writing a C library in the meantime, as I mentioned above, and it has turned out quite nice for me so far. Using it would also remove the raw audio processing code from this PR, effectively making the framework consume an external package for alternative audio backends. Obviously, though, it's native, so potentially unsafe.

I could have tried improving an existing audio library like I said above, but I ended up making a new one, because decoding needs to happen asynchronously so as not to delay playback, and it needs to support tempo adjustment, BiQuad filters and resampling. I couldn't find any open-source library that can do all of that, so I did the 'new standard' thing... It was pretty fun, so I don't regret it. It also wasn't that hard, because all I did was rewrite what I had already written into C.

I know this isn't the best way, but audio stutters are way more annoying than visual stutters. You just get pop noises in your ears.

I want this PR to be good enough for what users expect, and a viable alternative to BASS for most use cases.
Note that it is not a drop-in replacement for BASS: it just re-implements my C# functions in C, so almost everything is one or two calls away without converting units and such, because it was pretty much made for osu!framework. It's also standalone, though, so I guess it makes more sense to maintain and package it in my separate repository and periodically bump the version here.

I will have to do some work (probably a lot of it on the mobile toolchains) to actually introduce it here, so I'd like to know what you think about this option.

I'll understand if you don't want this, because I also think it sounds like a step backwards in some ways. The current C# version can still live on as a free alternative for the framework.

@smoogipoo (Contributor) commented Oct 25, 2024

Beyond just latency, we still have ongoing reports of offset gradually falling behind (long trail starting at ppy/osu-stable-issues#306; a stable issue, but the same happens on lazer, see referencing issues) and stutters at the start of maps (BASS startup time). The second we can resolve by swapping the reference clock, but the first is kind of hard to do anything about, since un4seen doesn't work closely enough with us to get anywhere.

As far as I've been told, both of those issues are fixed with SDL3. I'm not against using native code, as long as the bulk of the core logic is C#.

@smoogipoo (Contributor)

I'm thinking you might even be able to make a C# NativeAOT project for it if people are uncomfortable with C. I have a bunch of experience setting that up on all relevant platforms, including mobile, if needed.

@hwsmm (Contributor, Author) commented Oct 28, 2024

Sorry for not responding quickly.

That's actually not surprising, since I started writing this backend because I was consistently hitting early in the game no matter what offset I used.

Honestly, I don't really know what to do, so I couldn't reply to your comment in time. I'm fine with maintaining either the C or the C# version, but if we end up with the C# one, I think we'd get more bug reports about audio stutters.
The native library unfortunately contains most of the core logic, to avoid GC stutters as much as possible. It is just an audio library: you create a channel, add it to a mixer, and then it plays.

I guess NativeAOT might work if people dislike using C, but we are not compiling the entire game with it, right? If we are compiling the audio framework into a shared object with NativeAOT, should we make bindings for it, or is there a way to dynamically load a NativeAOT-compiled library in C#?

@smoogipoo (Contributor)

> I'm fine with maintaining either C or C# version, but if we end up with C# one, I think we'd get another bug report about audio stutters

The hopeful idea is for the NativeAOT lib to do things in such a way that every allocation is accounted for, while also making it touchable by core developers (who are most experienced in C#).

> I guess NativeAOT might work if people dislike using C, but we are not compiling the entire game with it, right?

Nah, I don't think we ever will; the JIT is actually quite useful for us optimisation-wise. But that wouldn't help in this case anyway, because the idea is to isolate the osu! GC from the NativeAOT GC.

> If we are compiling the audio framework in a shared object with NativeAOT, should we make bindings for it, or is there a way to dynamically load a natively AOT-compiled library in C#?

You need to expose functions using [UnmanagedCallersOnly] (EntryPoint is required) (example). Then you consume it just like a normal native lib (example).
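A minimal sketch of what that looks like, assuming a hypothetical `audiolib` NativeAOT project published as a shared library (`audio_mix` and the names below are made up for illustration; the linked examples show the real setup):

```csharp
using System;
using System.Runtime.InteropServices;

// In the NativeAOT library project (published with /p:NativeLib=Shared):
public static class AudioExports
{
    // EntryPoint is required so the function is exported under a stable C symbol name.
    [UnmanagedCallersOnly(EntryPoint = "audio_mix")]
    public static int Mix(IntPtr buffer, int frameCount)
    {
        // ... mixing logic lives entirely inside the NativeAOT world ...
        return frameCount;
    }
}

// In the consuming (JIT-compiled) project, it is used like any other native lib:
public static partial class AudioNative
{
    [LibraryImport("audiolib", EntryPoint = "audio_mix")]
    public static partial int Mix(IntPtr buffer, int frameCount);
}
```

Only unmanaged/blittable types can cross this boundary, which is why the exported signatures stick to `IntPtr` and primitives.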

@hwsmm (Contributor, Author) commented Oct 28, 2024

Yeah, I agree on that point, too.

But I'm worried about how we should pass objects between .NET and NativeAOT. Do we need to create ObjectHandles for everything, since .NET objects are not blittable?

@smoogipoo (Contributor)

Yeah. You'd basically interface with it as if it's a normal C lib; there's no passing of objects or hackily dereferencing into a matching managed signature (except for blittable/unmanaged types, of course).
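In practice that usually means the NativeAOT side hands out opaque handles instead of objects, e.g. via GCHandle. A sketch of the pattern, with hypothetical `Channel`/`channel_*` names:

```csharp
using System;
using System.Runtime.InteropServices;

// NativeAOT side: managed objects never cross the boundary directly.
// A GCHandle keeps each object alive, and its IntPtr form acts as an opaque C handle.
public sealed class Channel
{
    public double Volume;
}

public static class ChannelExports
{
    [UnmanagedCallersOnly(EntryPoint = "channel_create")]
    public static IntPtr Create()
        => GCHandle.ToIntPtr(GCHandle.Alloc(new Channel()));

    [UnmanagedCallersOnly(EntryPoint = "channel_set_volume")]
    public static void SetVolume(IntPtr handle, double volume)
        => ((Channel)GCHandle.FromIntPtr(handle).Target!).Volume = volume;

    // The consumer must call this, or the object leaks (the NativeAOT GC
    // can't collect anything still referenced by a live GCHandle).
    [UnmanagedCallersOnly(EntryPoint = "channel_destroy")]
    public static void Destroy(IntPtr handle)
        => GCHandle.FromIntPtr(handle).Free();
}
```

The consuming side only ever sees `IntPtr`, so lifetime management has to be explicit, much like any C API.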

@hwsmm (Contributor, Author) commented Oct 28, 2024

Well, if I'm understanding correctly, that sounds like rewriting my C library back into C# again.

I don't think it should take too long, since I already have the .NET part from my native project (unless I run into weird issues). I'll start writing it sooner or later if you're fine with that.

@smoogipoo (Contributor)

Hold off on that one for a bit. Do you have a sample of the C lib to look at to see what needs to be done?

@hwsmm (Contributor, Author) commented Oct 28, 2024

https://gist.github.com/hwsmm/c2bb0e55694e14c83372b8bdcb73830b

Here are the (hand-written) function bindings. Once all of them are implemented, it should work well.
I hope the function names are clear enough to tell what they do.

It doesn't have fancy things like SDLBool, because I am the only consumer of this library...

@smoogipoo (Contributor)

I was hoping to see the implementation, because people may start to get uncomfortable if the library is very large. In general, I think people would be comfortable as long as most of the implementation is still managed and the native part is only really fundamental processing.

@hwsmm (Contributor, Author) commented Oct 28, 2024

Oh, if that's the reason, I can send you the compressed source tree. I don't want it to be public if it's not eventually going to be released, and it also needs a lot of polishing, so...

Can I send you a mail to the address in your GitHub profile?

However, I'm not sure where the border of 'fundamental processing' lies. The native part has to know all of the information it needs to play audio without relying on the C# part, to avoid stutters; that includes volume, balance, tempo, frequency and more.

At best, 'most of the implementation' can only mean exposing information (duration, channel counts, current time and so on) to the game, telling native channels to play/stop, and creating/destroying native channels when needed.

@smoogipoo (Contributor)

> Can I send you a mail to the address in your GitHub profile?

Yep, that's fine. I'll have a look at it with peppy and see if the scope is something we're comfortable with.

@hwsmm (Contributor, Author) commented Oct 29, 2024

I've just sent a mail!

I forgot to add this in the mail: I'm planning to implement the decoder on the .NET side, because it adds a bit of complexity and may cause linking problems with FFmpeg.

@hwsmm (Contributor, Author) commented Aug 27, 2025

Starting from https://discord.com/channels/188630481301012481/589331078574112768/1396539637861847141, I copied most parts from this PR to the new library, and I got most things working... on my PC.

As a side note, I was actually a few hours away from (pre-)releasing my native library here when smoogi sent me that message, since it had been pretty stable lately, so I was not motivated to rewrite it in NativeAOT. I ended up deciding to do it anyway, because it required much less effort given I already had some of it in C#.

I was honestly surprised by Satori. When I finished the basic implementation and tested with plain NativeAOT, the result was not good on a very small buffer (around 1ms).

With Satori, it still had some underruns, but it was much more acceptable than plain NativeAOT and this PR.

Anyway, here are some points to know:

  • It is LGPL, because SoundTouch.Net (LGPL v2.1) is statically compiled in.
  • BASS is the default decoder, because it needs to work for now. It can also use FFmpeg 7 with libswresample and an audio decoder, if available and if the BASS init code is removed from SDL3AudioManager.cs.
  • The decoder is integrated because I want to keep the API stable for another native library of mine (not public yet).
  • Android probably works (not tested yet), but no iOS, because I don't have any Apple devices. It's too cumbersome to figure out xcframework when I can only use CI for the Apple toolchain.
  • Satori is not available on macOS x86_64 (due to a build failure; this upstream issue suggests it has been fixed in new Xcode 16 releases, but no luck for me) or Android (don't know how).
  • The repository structure is pretty rough right now. I may fix it in the future, but it works for now (repo).
  • You can set the environment variable SDL_AUDIO_DEVICE_SAMPLE_FRAMES to adjust latency. 128 should work well on Windows with a good audio driver.
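For intuition, the per-buffer latency implied by that setting is just the frame count divided by the device sample rate. A quick sketch, assuming a 48 kHz output device (the actual rate depends on your hardware):

```csharp
using System;

// SDL_AUDIO_DEVICE_SAMPLE_FRAMES sets the audio buffer size in sample frames.
// Smaller buffers mean lower latency but a higher risk of underruns/crackling.
const int sampleRate = 48000; // assumed device rate

foreach (int frames in new[] { 32, 64, 128 })
{
    double latencyMs = frames * 1000.0 / sampleRate;
    Console.WriteLine($"{frames} frames = {latencyMs:0.00} ms per buffer");
}
// At 48 kHz: 128 frames is about 2.67 ms, 64 about 1.33 ms, 32 about 0.67 ms.
```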

How to try:

  1. git clone -b sdl-audio-alt https://github.com/hwsmm/osu-framework.git
  2. Apply this commit on top of upstream osu
  3. cd osu && ./UseLocalFramework.sh
  4. cd osu.Desktop && dotnet run
  5. Edit AudioDriver in framework.ini to SDL3

This is still very early, so it may be (completely) broken in some cases. I'd be grateful if you could try it and let me know whether it works well.

@hwsmm (Contributor, Author) commented Sep 22, 2025

@smoogipoo I'm not expecting an immediate response, but here is a mention in case you didn't enable notifications on this thread.

@smoogipoo (Contributor)

This is still in my emails but very low priority at the moment. I've been waiting on a conclusion to the discussion in #705 + #6628 because @peppy's already reviewing that.

@Rudicito (Contributor) commented Oct 22, 2025

@hwsmm I tested your changes. Note that I'm using 2025.816.0-lazer, not the latest version, because your framework branch is a bit outdated.

I'm on Linux (Fedora 42).

I tried with this beatmap. Note that I've set a 10 ms audio offset, since that's what I use in my regular lazer client, that I disabled beatmap hitsounds, and that I use argon pro.

With BASS: [screenshot 20251022_18h01m46s_grim]

With SDL3 (SDL_AUDIO_DEVICE_SAMPLE_FRAMES unset): [screenshot 20251022_18h02m10s_grim]

With SDL3 (SDL_AUDIO_DEVICE_SAMPLE_FRAMES = 128): [screenshot 20251022_18h56m04s_grim]

With SDL3 (SDL_AUDIO_DEVICE_SAMPLE_FRAMES = 64): [screenshot 20251022_18h43m10s_grim]

With SDL3 and SDL_AUDIO_DEVICE_SAMPLE_FRAMES set to 64 or 128, the hitsounds feel more responsive; it felt like I was often clicking early. I didn't encounter audio issues with these values. In the editor, I feel like the hitsounds are early.

With SDL_AUDIO_DEVICE_SAMPLE_FRAMES set to 32, the sound starts crackling; I didn't try playing a beatmap.

Hope this helps; you can ask me to test things if you want.

Maybe I should have tested with no audio offset? I can do that if you want.

@hwsmm (Contributor, Author) commented Oct 24, 2025

@Rudicito Hey, thanks for testing! Reporting issues you encounter while using it is sufficient for now (you may use whatever offset you want). The sdl-audio-alt branch is now updated, so you can probably update your game.
